Instance Smoothed Contrastive Learning for Unsupervised Sentence Embedding
Abstract
Contrastive learning-based methods, such as unsup-SimCSE, have achieved state-of-the-art (SOTA) performance in learning unsupervised sentence embeddings. However, in previous studies, each embedding used for contrastive learning is derived from only one sentence instance, and we call these embeddings instance-level embeddings. In other words, each embedding is regarded as a unique class of its own, which may hurt the generalization performance. In this study, we propose IS-CSE (instance smoothing contrastive sentence embedding) to smooth the boundaries of embeddings in the feature space. Specifically, we retrieve embeddings from a dynamic memory buffer according to semantic similarity to get a positive embedding group. Then the embeddings in the group are aggregated by a self-attention operation to produce a smoothed instance embedding for further analysis. We evaluate our method on standard semantic text similarity (STS) tasks and achieve an average of 78.30%, 79.47%, 77.73%, and 79.42% Spearman's correlation on the base of BERT-base, BERT-large, RoBERTa-base, and RoBERTa-large respectively, a 2.05%, 1.06%, 1.16%, and 0.52% improvement compared to unsup-SimCSE.
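The retrieve-then-aggregate step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a fixed-size FIFO buffer, cosine similarity for retrieval, and a simplified attention with no learned projections. The names `retrieve_group`, `smooth`, `k`, and `temperature` are hypothetical and chosen only for illustration.

```python
import math
from collections import deque


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve_group(query, buffer, k):
    """Return the k buffer embeddings most similar to the query (assumed rule)."""
    ranked = sorted(buffer, key=lambda e: cosine(query, e), reverse=True)
    return ranked[:k]


def smooth(query, group, temperature=1.0):
    """Aggregate the query and its group using softmax attention weights.

    Simplification: scores are plain cosine similarities to the query,
    with no learned query/key/value projections.
    """
    members = [query] + group
    scores = [cosine(query, m) / temperature for m in members]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    dim = len(query)
    return [sum(w * m[i] for w, m in zip(weights, members)) for i in range(dim)]


# FIFO buffer of earlier embeddings; maxlen makes the oldest entries drop out.
memory = deque(maxlen=4)
for emb in ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]):
    memory.append(emb)

query = [1.0, 0.1]
group = retrieve_group(query, memory, k=2)
smoothed = smooth(query, group)
```

Because the attention weights are non-negative and sum to one, the smoothed embedding is a convex combination of the instance and its retrieved neighbours, which is one way to read the "smoothing" of instance boundaries.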
Similar resources
Learning Contrastive Connectives in Sentence Realization Ranking
We look at the average frequency of contrastive connectives in the SPaRKy Restaurant Corpus with respect to realization ratings by human judges. We implement a discriminative n-gram ranker to model these ratings and analyze the resulting n-gram weights to determine if our ranker learns this distribution. Surprisingly, our ranker learns to avoid contrastive connectives. We look at possible expla...
Supervised and Unsupervised Learning for Sentence Compression
In Statistics-Based Summarization Step One: Sentence Compression, Knight and Marcu (Knight and Marcu, 2000) (K&M) present a noisy-channel model for sentence compression. The main difficulty in using this method is the lack of data; Knight and Marcu use a corpus of 1035 training sentences. More data is not easily available, so in addition to improving the original K&M noisy-channel model, we cre...
Instance Embedding Transfer to Unsupervised Video Object Segmentation
We propose a method for unsupervised video object segmentation by transferring the knowledge encapsulated in image-based instance embedding networks. The instance embedding network produces an embedding vector for each pixel that enables identifying all pixels belonging to the same object. Though trained on static images, the instance embeddings are stable over consecutive video frames, which a...
MILEAGE: Multiple Instance LEArning with Global Embedding
Multiple Instance Learning (MIL) generally represents each example as a collection of instances such that the features for local objects can be better captured, whereas traditional learning methods typically extract a global feature vector for each example as an integral part. However, there is limited research work on investigating which of the two learning scenarios performs better. This pape...
Smoothed Dual Embedding Control
We revisit the Bellman optimality equation with Nesterov’s smoothing technique and provide a unique saddle-point optimization perspective of the policy optimization problem in reinforcement learning based on Fenchel duality. A new reinforcement learning algorithm, called Smoothed Dual Embedding Control or SDEC, is derived to solve the saddle-point reformulation with arbitrary learnable function...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2023
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v37i11.26512